potential gain
Influence Maximization via Graph Neural Bandits
Feng, Yuting, Tan, Vincent Y. F., Cautis, Bogdan
We consider a ubiquitous scenario in the study of Influence Maximization (IM), in which there is limited knowledge about the topology of the diffusion network. We set the IM problem in a multi-round diffusion campaign, aiming to maximize the number of distinct users that are influenced. Leveraging the capability of bandit algorithms to effectively balance the objectives of exploration and exploitation, as well as the expressivity of neural networks, our study explores the application of neural bandit algorithms to the IM problem. We propose the framework IM-GNB (Influence Maximization with Graph Neural Bandits), where we provide an estimate of the users' probabilities of being influenced by influencers (also known as diffusion seeds). This initial estimate forms the basis for constructing both an exploitation graph and an exploration one. Subsequently, IM-GNB handles the exploration-exploitation tradeoff, by selecting seed nodes in real-time using Graph Convolutional Networks (GCN), in which the pre-estimated graphs are employed to refine the influencers' estimated rewards in each contextual setting. Through extensive experiments on two large real-world datasets, we demonstrate the effectiveness of IM-GNB compared with other baseline methods, significantly improving the spread outcome of such diffusion campaigns, when the underlying network is unknown.
Neural Exploitation and Exploration of Contextual Bandits
Ban, Yikun, Yan, Yuchen, Banerjee, Arindam, He, Jingrui
In this paper, we study utilizing neural networks for the exploitation and exploration of contextual multi-armed bandits. Contextual multi-armed bandits have been studied for decades with various applications. To solve the exploitation-exploration trade-off in bandits, there are three main techniques: epsilon-greedy, Thompson Sampling (TS), and Upper Confidence Bound (UCB). In recent literature, a series of neural bandit algorithms have been proposed to adapt to the non-linear reward function, combined with TS or UCB strategies for exploration. In this paper, instead of calculating a large-deviation based statistical bound for exploration like previous methods, we propose "EE-Net", a novel neural-based exploitation and exploration strategy. In addition to using a neural network (Exploitation network) to learn the reward function, EE-Net uses another neural network (Exploration network) to adaptively learn the potential gains compared to the currently estimated reward for exploration. We provide an instance-based $\widetilde{\mathcal{O}}(\sqrt{T})$ regret upper bound for EE-Net and show that EE-Net outperforms related linear and neural contextual bandit baselines on real-world datasets.
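For readers unfamiliar with the classic exploration techniques named above, the following is a minimal sketch of two of them, epsilon-greedy and UCB1, on a non-contextual Bernoulli bandit. It is not taken from the paper: the function names, the `run` helper, and the Bernoulli reward model are illustrative assumptions, and UCB1 is used here as one standard member of the UCB family.

```python
import math
import random


def epsilon_greedy(means, epsilon, rng):
    """With probability epsilon pick a uniformly random arm, else the best empirical mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))
    return max(range(len(means)), key=lambda a: means[a])


def ucb1(means, counts, t):
    """Pick the arm maximizing mean + sqrt(2 ln t / n); untried arms are pulled first."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    return max(
        range(len(means)),
        key=lambda a: means[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run(select, probs, rounds, seed=0):
    """Simulate a Bernoulli bandit (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    means = [0.0] * len(probs)
    total = 0.0
    for t in range(1, rounds + 1):
        arm = select(means, counts, t, rng)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean reward of the chosen arm.
        means[arm] += (reward - means[arm]) / counts[arm]
        total += reward
    return total, counts


if __name__ == "__main__":
    probs = [0.2, 0.8]  # assumed success probabilities of two arms
    print(run(lambda m, c, t, r: ucb1(m, c, t), probs, 2000))
    print(run(lambda m, c, t, r: epsilon_greedy(m, 0.1, r), probs, 2000))
```

Both rules add some form of exploration on top of the empirical mean: epsilon-greedy explores at random, while UCB1 adds a confidence bonus that shrinks as an arm is pulled more often.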
EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits
Ban, Yikun, Yan, Yuchen, Banerjee, Arindam, He, Jingrui
Contextual multi-armed bandits have been studied for decades and adapted to various applications such as online advertising and personalized recommendation. To solve the exploitation-exploration tradeoff in bandits, there are three main techniques: epsilon-greedy, Thompson Sampling (TS), and Upper Confidence Bound (UCB). In recent literature, linear contextual bandits have adopted ridge regression to estimate the reward function and combine it with TS or UCB strategies for exploration. However, this line of work explicitly assumes the reward is a linear function of the arm vectors, which may not hold in real-world datasets. To overcome this challenge, a series of neural-based bandit algorithms have been proposed, where a neural network is assigned to learn the underlying reward function and TS or UCB is adapted for exploration. In this paper, we propose "EE-Net", a neural-based bandit approach with a novel exploration strategy. In addition to utilizing a neural network (Exploitation network) to learn the reward function, EE-Net adopts another neural network (Exploration network) to adaptively learn potential gains compared to the currently estimated reward. Then, a decision-maker is constructed to combine the outputs from the Exploitation and Exploration networks. We prove that EE-Net achieves $\mathcal{O}(\sqrt{T\log T})$ regret, which is tighter than existing state-of-the-art neural bandit algorithms ($\mathcal{O}(\sqrt{T}\log T)$ for both UCB-based and TS-based). Through extensive experiments on four real-world datasets, we show that EE-Net outperforms existing linear and neural bandit approaches.
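The three components described in this abstract (an exploitation model, an exploration model trained on the potential gain, and a decision-maker that combines them) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: both networks are replaced by small linear models trained with one SGD step per sample, the exploration model takes the raw context as input, and the decision-maker is a plain sum of the two scores. The names `LinearSGD` and `ee_round` are hypothetical.

```python
import numpy as np


class LinearSGD:
    """Tiny linear regressor updated by one SGD step per sample.

    Assumption: stands in for a neural network to keep the sketch short.
    """

    def __init__(self, dim, lr=0.05):
        self.w = np.zeros(dim)
        self.lr = lr

    def predict(self, x):
        return float(self.w @ x)

    def update(self, x, target):
        # Gradient step on the squared error between prediction and target.
        self.w += self.lr * (target - self.predict(x)) * x


def ee_round(exploit, explore, contexts, reward_fn):
    """Play one round: score arms, pull the best one, update both models."""
    # Decision-maker (assumed here to be a simple sum of the two outputs).
    scores = [exploit.predict(x) + explore.predict(x) for x in contexts]
    arm = int(np.argmax(scores))
    x = contexts[arm]
    reward = reward_fn(x)
    # Potential gain: observed reward minus the current exploitation estimate,
    # computed before the exploitation model is updated.
    gain = reward - exploit.predict(x)
    explore.update(x, gain)
    exploit.update(x, reward)
    return arm, reward


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_w = np.array([0.2, 0.7, -0.1])  # hypothetical reward weights
    exploit, explore = LinearSGD(3), LinearSGD(3)
    total = 0.0
    for _ in range(500):
        contexts = [rng.normal(size=3) for _ in range(4)]  # 4 arms per round
        _, r = ee_round(exploit, explore, contexts, lambda x: float(true_w @ x))
        total += r
    print("cumulative reward:", total)
```

The point of the design is that the exploration signal is learned from data (the gap between observed and predicted reward) rather than derived from a confidence bound, which is the contrast the abstract draws with UCB-based and TS-based methods.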
Millions of UK workers at risk of being replaced by robots, study says
More than 10 million UK workers are at high risk of being replaced by robots within 15 years as the automation of routine tasks gathers pace in a new machine age. A report by the consultancy firm PwC found that 30% of jobs in Britain were potentially under threat from breakthroughs in artificial intelligence (AI). In some sectors half the jobs could go. The report predicted that automation would boost productivity and create fresh job opportunities, but it said action was needed to prevent the widening of inequality that would result from robots increasingly being used for low-skill tasks. PwC said 2.25 million jobs were at high risk in wholesale and retailing (the sector that employs most people in the UK), and 1.2 million were under threat in manufacturing, 1.1 million in administrative and support services and 950,000 in transport and storage.
Is AI being oversold?
It was oversold in the past. Easy to see why: the potential gains were (and still are) enormous and any indication you were on the right track meant people would throw money at you. And if you then couldn't deliver anything monetizable, the money people would shred you and your reputation. DL systems have achieved near-human performance in at least 6 problem domains (signal processing, low-level speech understanding, image understanding, text understanding, Atari games, and Go). From now on, we can use incremental improvements and we can reliably measure our progress.